Reconstructing protein binding patterns from ChIP time-series
Abstract
Motivation: Gene transcription requires the orchestrated binding of various proteins to the promoter of a gene. The binding times and binding order of these proteins allow conclusions to be drawn about their exact function in the recruitment process. Time-resolved ChIP experiments are used to analyze the order of protein binding in these processes. However, the resulting ChIP signals do not represent the exact protein binding patterns.

Results: We show that, for promoter complexes that follow sequential recruitment dynamics, the ChIP signal can be understood as a convolved signal, and we propose applying deconvolution methods to recover the protein binding patterns from experimental ChIP time-series. We analyze the suitability of four deconvolution methods: two non-blind methods (Wiener deconvolution and Lucy-Richardson deconvolution) and two blind methods (blind Lucy-Richardson deconvolution and binary blind deconvolution). We apply these methods to infer the protein binding pattern from ChIP time-series for the pS2 gene.

Contact: [email protected]

An essential step in gene expression is the initiation of transcription. This process requires the orchestrated recruitment of regulatory proteins to the gene promoter, which leads to the formation of the transcriptional machinery that transcribes a gene from DNA to RNA. Transcription factors assemble on the promoter site, forming a sequence of protein complexes. Eventually, the protein complex attracts the RNA polymerase that transcribes the gene, and finally the promoter is cleared again. During this process, the modification of epigenetic marks on the DNA and histones was shown to be essential for the initiation of transcription (Métivier et al., 2006). For various promoters, the proteins participating in complex formation have been identified, yet determining the exact order and timing of the recruitment events is still a non-trivial task.
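The idea stated in the abstract, that a ChIP time-series is a convolved version of the underlying binding pattern and can be sharpened by non-blind deconvolution, can be illustrated with a small sketch. The code below is a plain one-dimensional Lucy-Richardson iteration written for illustration only; it is not the authors' implementation. The kernel, the binding pattern, and the function names (`convolve`, `correlate`, `richardson_lucy`) are hypothetical choices, and the example assumes noise-free data and a known kernel.

```python
def convolve(pattern, kernel):
    """Causal discrete convolution, truncated to the length of `pattern`."""
    n, m = len(pattern), len(kernel)
    return [sum(kernel[k] * pattern[t - k] for k in range(min(t + 1, m)))
            for t in range(n)]


def correlate(values, kernel):
    """Adjoint of `convolve`: out[j] = sum_k kernel[k] * values[j + k]."""
    n, m = len(values), len(kernel)
    return [sum(kernel[k] * values[j + k] for k in range(min(m, n - j)))
            for j in range(n)]


def richardson_lucy(signal, kernel, iterations=200, eps=1e-12):
    """Non-blind Lucy-Richardson deconvolution of a non-negative 1D signal."""
    n = len(signal)
    # Start from a flat, strictly positive estimate.
    estimate = [max(sum(signal) / n, eps)] * n
    # Column sums of the convolution operator (below 1 near the right edge).
    norm = [max(w, eps) for w in correlate([1.0] * n, kernel)]
    for _ in range(iterations):
        blurred = convolve(estimate, kernel)
        ratio = [s / max(b, eps) for s, b in zip(signal, blurred)]
        correction = correlate(ratio, kernel)
        # Multiplicative update keeps the estimate non-negative.
        estimate = [e * c / w for e, c, w in zip(estimate, correction, norm)]
    return estimate


# Hypothetical example: a protein bound during two time windows,
# smeared by an assumed (made-up) desynchronization kernel.
true_pattern = [0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0]
kernel = [0.4, 0.3, 0.2, 0.1]
chip_signal = convolve(true_pattern, kernel)
recovered = richardson_lucy(chip_signal, kernel)
```

The multiplicative update is what makes this family of methods attractive for occupancy data: an estimate that starts non-negative stays non-negative. In the blind variants mentioned in the abstract the kernel is not known in advance and has to be estimated together with the pattern, which this sketch does not attempt.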
Extensive ChIP experiments have been used to examine the binding order of proteins and the modification of epigenetic marks (Métivier et al., 2006; Lee et al., 2005). Since a large number of cells (≈ 10) are required to perform ChIP experiments (Métivier et al., 2006), such experiments are performed on initially synchronized cell populations with a cleared promoter site.

Fig. 1. Visualization of the recruitment network for sequential recruitment. The recruitment matrix aji circularly links the recruitment states, and each state has only one successor state.

Usually, ChIP experiments are analyzed heuristically by applying prior knowledge of protein interactions to interpret the form of the ChIP signal. In ?, the sequence of maxima and minima in ChIP time-series was used to predict the structure of the dominant negative feedback loop that governs the recruitment dynamics. In Hanel et al. (2012), ChIP data was analyzed by representing the recruitment process as a regulatory network in which the proteins involved in the recruitment process are represented by nodes. By linearizing the regulatory dynamics, one can determine how the binding of one protein affects the binding affinity of other proteins. By assuming a circular recruitment process that is traversed stochastically in each cell, ? reproduced the binding pattern by a least-squares fit of a ChIP signal against simulation data of the recruitment process. For the pS2 promoter, this method results in binding patterns that show multiple binding times for most proteins. These binding patterns have been interpreted as stochastic binding events of the kind also seen with other experimental methods, for example GFP (green fluorescent protein). In Schölling et al. (2013) we generalized this recruitment model to address the question of whether the binding order in a recruitment process is deterministic (sequential recruitment) or stochastic (probabilistic recruitment).
In this model, the recruitment process is represented by a walk on a network of recruitment states. In the case of sequential recruitment, each recruitment state has only one possible successor state (compare Fig. 1), whereas in the case of probabilistic recruitment the topology of the network can be much more complex. We showed that in a cell population the occupation ...

© Oxford University Press 2013.
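The sequential recruitment picture described above, a walk on a circular network of recruitment states in which every state has exactly one successor, can be made concrete with a toy simulation. The sketch below is only an illustration under simplifying assumptions that are not taken from the paper: time is discrete, every cell advances to its successor state with the same fixed probability per step, and the function name `simulate_chip_signal` and all parameter values are hypothetical.

```python
import random


def simulate_chip_signal(n_states, bound_states, n_cells=2000,
                         n_steps=60, p_advance=0.5, seed=0):
    """Population-averaged occupancy for a walk on a circular state network.

    Each cell occupies one of `n_states` recruitment states; the only
    possible successor of state s is (s + 1) % n_states. The protein of
    interest is assumed to be bound in the states listed in `bound_states`.
    """
    rng = random.Random(seed)
    # Synchronized population: every cell starts in state 0.
    states = [0] * n_cells
    signal = []
    for _ in range(n_steps):
        # ChIP-like readout: fraction of cells with the protein bound.
        bound = sum(1 for s in states if s in bound_states)
        signal.append(bound / n_cells)
        # Each cell independently advances with probability p_advance.
        states = [(s + 1) % n_states if rng.random() < p_advance else s
                  for s in states]
    return signal


# Hypothetical example: six states, protein bound in states 2 and 3.
signal = simulate_chip_signal(n_states=6, bound_states={2, 3})
```

In this toy model every single cell shows a sharp on/off binding pattern, but the population average starts with a pronounced peak and then flattens toward the fraction of bound states as the cells lose synchrony. This is the kind of smearing that makes a measured ChIP signal differ from the binding pattern of an individual promoter.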